Translation Probabilities in Cross-language Information Retrieval
Abstract
Translation ambiguity is a major problem in dictionary-based cross-language information retrieval. Indirect disambiguation approaches do not explicitly resolve translation ambiguity; instead they rely on query-structuring techniques such as a structured Boolean model and Pirkola’s method. Direct disambiguation approaches try to assign translation probabilities to translation equivalents, normally by using co-occurrence statistics of target-language terms from target documents as disambiguation clues. Thus far, translation probabilities have not been well explored in terms of statistical query translation models, query formulation, or cross-lingual retrieval models. To study the impact of translation probabilities on retrieval effectiveness in direct disambiguation approaches, this paper empirically investigates four issues: the disambiguation factors that affect the calculation of translation probabilities, a comparison of cross-lingual query formulation techniques involving translation probabilities, the relationship between the accuracy of translation disambiguation and retrieval effectiveness, and the relationship between the top n translations and retrieval effectiveness.
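To make the direct disambiguation idea concrete, the following is a minimal sketch, not the paper's method. It assumes a toy target-language corpus, hypothetical source terms `q1` and `q2`, document-level co-occurrence counts as the only disambiguation clue, and an arbitrary additive smoothing constant. Each candidate translation is scored by how often it co-occurs with the candidate translations of the other query terms, and the scores are normalized per source term to give translation probabilities.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count the documents in which each unordered pair of distinct terms appears."""
    counts = Counter()
    for doc in documents:
        for a, b in combinations(sorted(set(doc)), 2):
            counts[(a, b)] += 1
    return counts


def translation_probabilities(candidates, counts, smoothing=0.1):
    """Score each candidate by its co-occurrence with the candidates of the
    other query terms, then normalize the scores per source term.
    The smoothing value is an illustrative assumption."""
    probs = {}
    for source, translations in candidates.items():
        scores = {}
        for t in translations:
            support = sum(
                counts[tuple(sorted((t, other)))]
                for other_source, others in candidates.items()
                if other_source != source
                for other in others
                if other != t
            )
            scores[t] = support + smoothing
        total = sum(scores.values())
        probs[source] = {t: s / total for t, s in scores.items()}
    return probs


# Toy target-language documents (hypothetical data).
docs = [
    ["bank", "money", "loan"],
    ["bank", "money"],
    ["shore", "river"],
    ["money", "loan"],
]

# Hypothetical source query terms and their dictionary translations.
candidates = {"q1": ["bank", "shore"], "q2": ["money"]}

probs = translation_probabilities(candidates, cooccurrence_counts(docs))
print(probs)
```

In this toy corpus "bank" co-occurs with "money" while "shore" does not, so "bank" receives most of the probability mass for `q1`. A real system would likely use an association measure such as mutual information rather than raw counts.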
Similar resources
Matching Meaning for Cross-Language Information Retrieval
This article describes a framework for cross-language information retrieval that efficiently leverages statistical estimation of translation probabilities. The framework provides a unified perspective into which some earlier work on techniques for cross-language information retrieval based on translation probabilities can be cast. Modeling synonymy and filtering translation probabilities using ...
Translation Resources, Merging Strategies, and Relevance Feedback for Cross-Language Information Retrieval
This paper describes the official runs of the Twenty-One group for the first CLEF workshop. The Twenty-One group participated in the monolingual, bilingual and multilingual tasks. The following new techniques are introduced in this paper. In the bilingual task we experimented with different methods to estimate translation probabilities. In the multilingual task we experimented with refinements ...
Structured queries, language modeling, and relevance modeling in cross-language information retrieval
Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in a...
Different approaches to Cross Language Information Retrieval
This paper describes two experiments in the domain of Cross Language Information Retrieval. Our basic approach is to translate queries word by word using machine-readable dictionaries. The first experiment compared different strategies to deal with word sense ambiguity: i) keeping all translations and integrating translation probabilities in the model, ii) a single translation is selected on the ...
Query Translation Disambiguation as Graph Partitioning
Resolving ambiguity in the process of query translation is crucial to cross-language information retrieval when only a bilingual dictionary is available. In this paper we propose a novel approach for query translation disambiguation, named “spectral query translation model”. The proposed approach views the problem of query translation disambiguation as a graph partitioning problem. For a given ...
Journal:
- Int. J. Comput. Proc. Oriental Lang.
Volume 18
Publication date: 2005